You are looking at historical revision 1063 of this page. It may differ significantly from its current revision.

Data Representation

There are two kinds of data objects in the CHICKEN system: immediate and non-immediate objects. Immediate objects are represented by a tagged machine word, which is usually 32 bits long (64 bits on 64-bit architectures). Immediate objects come in four different flavors:

fixnums, that is, small exact integers, distinguished by the lowest-order bit of the machine word being set to 1. This leaves fixnums 31 bits for the actual numeric value (63 bits on 64-bit architectures).

characters, where the lowest four bits of machine words containing characters are equal to C_CHARACTER_BITS. The ASCII code of the character is encoded in bits 9 to 16, counting from 1 and starting at the lowest order position.

booleans, where the lowest four bits of machine words containing booleans are equal to C_BOOLEAN_BITS. Bit 5 (counting from 1 and starting at the lowest order position, as for the other immediate types) is 1 if the boolean designates true, or 0 if it designates false.

other values: the empty list, void and end-of-file. The lowest four bits of machine words containing these values are equal to C_SPECIAL_BITS. Bits 5 to 8 contain an identifying number for this type of object. The following constants are defined: C_SCHEME_END_OF_LIST, C_SCHEME_UNDEFINED and C_SCHEME_END_OF_FILE.
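The tagging scheme for fixnums, characters and the special values can be illustrated with a minimal C sketch. All names carrying the sk_ or SK_ prefix are hypothetical, and the tag values (0x0a for characters, 0x0e for special values) are assumptions made for this sketch; the authoritative definitions are those in chicken.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and tag values for this sketch only; the real
   definitions are in chicken.h and may differ. */
typedef intptr_t sk_word;                 /* one tagged machine word */

#define SK_FIXNUM_BIT     ((sk_word)0x01) /* lowest-order bit set: fixnum */
#define SK_LOW4_MASK      ((sk_word)0x0f) /* the lowest four bits */
#define SK_CHARACTER_BITS ((sk_word)0x0a) /* assumed tag value */
#define SK_SPECIAL_BITS   ((sk_word)0x0e) /* assumed tag value */

/* Fixnums: the value sits above the tag bit, so n is stored as 2n + 1. */
static sk_word sk_fix(sk_word n)       { return n * 2 + 1; }
static sk_word sk_unfix(sk_word w)     { return (w - 1) / 2; }
static int     sk_is_fixnum(sk_word w) { return (w & SK_FIXNUM_BIT) != 0; }

/* Characters: tag in the low four bits, code in bits 9 to 16 (from 1). */
static sk_word sk_make_char(unsigned char c)
{
    return ((sk_word)c << 8) | SK_CHARACTER_BITS;
}
static int sk_is_char(sk_word w)
{
    return (w & SK_LOW4_MASK) == SK_CHARACTER_BITS;
}
static unsigned char sk_char_code(sk_word w)
{
    return (unsigned char)((w >> 8) & 0xff);
}

/* Special values: tag in the low four bits, id in bits 5 to 8 (from 1). */
static sk_word sk_make_special(unsigned id)
{
    return ((sk_word)(id & 0x0f) << 4) | SK_SPECIAL_BITS;
}
static int sk_is_special(sk_word w)
{
    return (w & SK_LOW4_MASK) == SK_SPECIAL_BITS;
}
static unsigned sk_special_id(sk_word w)
{
    return (unsigned)((w >> 4) & 0x0f);
}

/* Round-trip checks: encode a value, then decode it again. */
void sk_check_immediates(void)
{
    assert(sk_is_fixnum(sk_fix(-42)));
    assert(sk_unfix(sk_fix(-42)) == -42);
    assert(sk_is_char(sk_make_char('A')));
    assert(!sk_is_fixnum(sk_make_char('A')));
    assert(sk_char_code(sk_make_char('A')) == 'A');
    assert(sk_is_special(sk_make_special(1)));
    assert(sk_special_id(sk_make_special(1)) == 1);
}
```

Because the assumed character and special tags have a zero in the lowest-order bit, a single bit test is enough to tell fixnums apart from the other immediate types in this sketch.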

Non-immediate objects are blocks of data represented by a pointer into the heap. The first word of the data block contains a header, which gives information about the type of the object. The header is one machine word in size, usually 32 bits (64 bits on 64-bit architectures). Its fields, given here for the 32-bit case, are:

bits 1 to 24 (starting at the lowest order position) contain the length of the data object, which is either the number of bytes in a string (or byte-vector) or the number of elements for a vector or for a structure type.

bits 25 to 28 contain the type code of the object.

bits 29 to 32 contain miscellaneous flags used for garbage collection or internal data type dispatching. These flags are:

C_GC_FORWARDING_BIT
Flag used for forwarding garbage collected object pointers.
C_BYTEBLOCK_BIT
Flag that specifies whether this data object contains raw bytes (a string or byte-vector) or pointers to other data objects.
C_SPECIALBLOCK_BIT
Flag that specifies whether this object contains a special non-object pointer value in its first slot. Closures are an example of this kind of object: they are vector-like objects with the code pointer as the first item.
C_8ALIGN_BIT
Flag that specifies whether the data area of this block should be aligned on an 8-byte boundary (as for floating-point numbers, for example).
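The header fields can be packed and unpacked with plain masks and shifts. The following is a minimal sketch for a 32-bit header, not the code from chicken.h: the sk_ and SK_ names are hypothetical, the position of each individual flag within bits 29 to 32 is an assumption, and the type code 2 used in the check is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t sk_header;               /* 32-bit header word */

#define SK_SIZE_MASK 0x00ffffffu          /* bits 1 to 24: length */
#define SK_TYPE_MASK 0x0f000000u          /* bits 25 to 28: type code */
#define SK_FLAG_MASK 0xf0000000u          /* bits 29 to 32: flags */

/* Assumed order of the four flags within bits 29 to 32. */
#define SK_8ALIGN_BIT        0x10000000u
#define SK_SPECIALBLOCK_BIT  0x20000000u
#define SK_BYTEBLOCK_BIT     0x40000000u
#define SK_GC_FORWARDING_BIT 0x80000000u

/* Combine flags, a 4-bit type code and a 24-bit length into a header. */
static sk_header sk_make_header(uint32_t flags, uint32_t type, uint32_t size)
{
    return (flags & SK_FLAG_MASK)
         | ((type & 0x0fu) << 24)
         | (size & SK_SIZE_MASK);
}

static uint32_t sk_header_size(sk_header h) { return h & SK_SIZE_MASK; }
static uint32_t sk_header_type(sk_header h) { return (h & SK_TYPE_MASK) >> 24; }

/* True if the data area holds raw bytes rather than object pointers. */
static int sk_is_byteblock(sk_header h)
{
    return (h & SK_BYTEBLOCK_BIT) != 0;
}

/* True if the first slot holds a non-object pointer (e.g. a closure). */
static int sk_is_specialblock(sk_header h)
{
    return (h & SK_SPECIALBLOCK_BIT) != 0;
}

/* Header for a 5-byte string-like block with a made-up type code of 2. */
void sk_check_header(void)
{
    sk_header h = sk_make_header(SK_BYTEBLOCK_BIT, 2u, 5u);
    assert(sk_header_size(h) == 5u);
    assert(sk_header_type(h) == 2u);
    assert(sk_is_byteblock(h));
    assert(!sk_is_specialblock(h));
}
```

A collector written against such a layout would consult the byte-block flag before scanning the data area, since raw bytes must not be interpreted as object pointers.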

The actual data follows immediately after the header. Note that block-addresses are always aligned to the native machine-word boundary. Scheme data objects map to blocks in the following manner:

pairs: vector-like object (type bits C_PAIR_TYPE), where the car and the cdr are contained in the first and second slots, respectively.

vectors: vector object (type bits C_VECTOR_TYPE).

strings: byte-vector object (type bits C_STRING_TYPE).

procedures: special vector object (type bits C_CLOSURE_TYPE). The first slot contains a pointer to a compiled C function. Any extra slots contain the free variables (since a flat closure representation is used).

flonum: a byte-vector object (type bits C_FLONUM_BITS). Slots one and two (or a single slot on 64-bit architectures) contain a 64-bit floating-point number, in the representation used by the host system's C compiler.

symbol: a vector object (type bits C_SYMBOL_TYPE). Slots one and two contain the toplevel variable value and the print-name (a string) of the symbol, respectively.

port: a special vector object (type bits C_PORT_TYPE). The first slot contains a pointer to a file stream if the port is backed by a file, or NULL if not. The other slots contain housekeeping data used for this port.

structure: a vector object (type bits C_STRUCTURE_TYPE). The first slot contains a symbol that specifies the kind of structure this record is an instance of. The other slots contain the actual record items.

pointer: a special vector object (type bits C_POINTER_TYPE). The single slot contains a machine pointer.

tagged pointer: similar to a pointer (type bits C_TAGGED_POINTER_TYPE), but the object contains an additional slot with a tag (an arbitrary data object) that identifies the type of the pointer.
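To make the block layout concrete, here is a small C sketch that builds a two-element list out of pair blocks and walks it. It is an illustration under stated assumptions, not the CHICKEN runtime: every sk_ and SK_ name is hypothetical, the pair type code and the encoding of the empty list are made up for the example, the header follows the 32-bit field positions described above, and immediates are recognized by testing the two lowest bits, which works here only because blocks are word-aligned and the assumed immediate tags all have one of those bits set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t sk_word;                /* tagged word or block pointer */

/* Hypothetical encodings, chosen for this sketch only. */
#define SK_FIX(n)          ((((sk_word)(n)) << 1) | 1u) /* non-negative n */
#define SK_END_OF_LIST     ((sk_word)0x0e)              /* assumed value */
#define SK_PAIR_TYPE       ((sk_word)0x3)               /* made-up code */
#define SK_HEADER(type, n) ((((sk_word)(type)) << 24) | (sk_word)(n))

/* A pair block: the header, then the car and cdr slots. */
typedef struct {
    sk_word header;
    sk_word car;
    sk_word cdr;
} sk_pair;

/* Word-aligned block addresses have both low bits clear. */
static int sk_is_immediate(sk_word w) { return (w & 0x3u) != 0; }

static int sk_is_pair(sk_word w)
{
    if (sk_is_immediate(w)) return 0;
    return ((((const sk_pair *)w)->header >> 24) & 0x0f) == SK_PAIR_TYPE;
}

/* Initialize caller-provided storage as a pair and return it as a word. */
static sk_word sk_cons(sk_pair *cell, sk_word car, sk_word cdr)
{
    cell->header = SK_HEADER(SK_PAIR_TYPE, 2); /* two slots */
    cell->car = car;
    cell->cdr = cdr;
    return (sk_word)cell;
}

/* Count pairs by following cdr slots until a non-pair is reached. */
static size_t sk_list_length(sk_word lst)
{
    size_t n = 0;
    while (sk_is_pair(lst)) {
        n++;
        lst = ((const sk_pair *)lst)->cdr;
    }
    return n;
}

/* Build the list (1 2) in local storage and measure it. */
size_t sk_demo_length(void)
{
    sk_pair cells[2];                     /* stands in for heap storage */
    sk_word tail = sk_cons(&cells[1], SK_FIX(2), SK_END_OF_LIST);
    sk_word lst  = sk_cons(&cells[0], SK_FIX(1), tail);
    assert(((const sk_pair *)lst)->car == SK_FIX(1));
    return sk_list_length(lst);
}
```

The other vector-like types differ from this only in their type code and in what the slots mean; a closure, for instance, would additionally carry the special-block flag because its first slot is a code pointer rather than an object.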

Data objects may be allocated outside of the garbage-collected heap, as long as their layout follows the scheme described above. However, care must be taken not to store heap data (i.e. non-immediate objects) into such objects, because this will confuse the garbage collector.
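One way to respect this restriction is to route every store into a statically allocated object through a setter that accepts immediate values only. The sketch below shows the idea for a three-slot vector. It is a hedged illustration: all sk_ and SK_ names, the vector type code and the two-bit immediate test are assumptions of this example and are not taken from chicken.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t sk_word;

/* Hypothetical encodings, chosen for this sketch only. */
#define SK_FIX(n)          ((((sk_word)(n)) << 1) | 1u) /* non-negative n */
#define SK_VECTOR_TYPE     ((sk_word)0x1)               /* made-up code */
#define SK_HEADER(type, n) ((((sk_word)(type)) << 24) | (sk_word)(n))
#define SK_TABLE_SIZE      3

typedef struct {
    sk_word header;
    sk_word slots[SK_TABLE_SIZE];
} sk_vector3;

/* A vector living in static storage, outside any collected heap.
   Its slots start out holding the fixnum 0. */
static sk_vector3 sk_table = {
    SK_HEADER(SK_VECTOR_TYPE, SK_TABLE_SIZE),
    { SK_FIX(0), SK_FIX(0), SK_FIX(0) }
};

/* Assumption: immediates always have one of the two low bits set,
   while word-aligned block pointers have both clear. */
static int sk_is_immediate(sk_word w) { return (w & 0x3u) != 0; }

/* Store value in slot i. Returns 1 on success, or 0 (leaving the
   vector untouched) if i is out of range or value is a block pointer. */
static int sk_table_set(size_t i, sk_word value)
{
    if (i >= SK_TABLE_SIZE || !sk_is_immediate(value)) return 0;
    sk_table.slots[i] = value;
    return 1;
}

/* Returns 1 if a fixnum store succeeds and a pointer store is refused. */
int sk_check_static_table(void)
{
    static sk_word fake_block[2];  /* its address stands in for heap data */
    int stored   = sk_table_set(0, SK_FIX(7));
    int rejected = !sk_table_set(1, (sk_word)fake_block);
    assert(sk_table.slots[0] == SK_FIX(7));
    assert(sk_table.slots[1] == SK_FIX(0));
    return stored && rejected;
}
```

Refusing the store keeps the static object free of references that a collector would neither trace nor update when it moves heap data.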

For more information see the header file chicken.h.